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Abstract: 

An atomic protein model with a minimalistic potential is developed and then tested on 
an a-helix and a /3-hairpin, using exactly the same parameters for both peptides. We 
find that melting curves for these sequences to a good approximation can be described 
by a simple two-state model, with parameters that are in reasonable quantitative 
agreement with experimental data. Despite the apparent two-state character of the 
melting curves, the energy distributions are found to lack a clear bimodal shape, 
which is discussed in some detail. We also perform a Monte Carlo-based kinetic 
study and find, in accord with experimental data, that the a-helix forms faster than 
the /3-hairpin. 
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1 Introduction 



Simulating protein folding at atomic resolution is a challenge, but no longer compu- 
tationally impossible, as shown by recent studies [T1I2] of Go-type [3] models with a 
bias towards the native structure. Extending these calculations to entirely sequence- 
based potentials remains, however, an open problem, due to well-known uncertainties 
about the form and relevance of different terms of the potential. In this situation, it 
is tempting to look into the properties of atomic models that are sequence-based and 
yet as simple and transparent as possible; for an example, see Kussell et al. 

The development of models for protein folding is hampered by the fact that short 
amino acid sequences with protein-like properties are rare, which makes the calibra- 
tion of potentials a non-trivial task. Breakthrough experiments in the past ten years 
have, however, found examples of such sequences. Of particular importance was the 
discovery of a peptide making /5-structure on its own [Sj, the second /?-hairpin from the 
protein G Bl domain, along with the finding that this 16-amino acid chain, like many 
small proteins, show two-state folding jHj. These experiments have stimulated many 
theoretical studies of the folding properties of this sequence, including simulations 
of atomic models with relatively detailed semi-empirical potentials [T|l8|l9|IT( H ITT | IT2] . 
Reproducing the melting behavior of the /^-hairpin has, however, proven non-trivial, 
as was recently pointed out by Zhou et al. [T2j. 

Here we develop and explore a simple sequence-based atomic model, which is found 
to provide a surprisingly good description of the thermodynamic behavior of this 
peptide. The same model, with unchanged parameters, is also applied to an a-helical 
peptide, the designed so-called Fg peptide with 21 amino acids [T!^IT^. We find that 
this sequence indeed makes an a-helix in the model, and our results for the stability 
of the helix agree reasonably well with experimental data ^lElElCSl- Finally, we 
also study Monte Carlo-based kinetics for both these peptides. Here we investigate 
the relaxation of ensemble averages at the respective melting temperatures. 
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2 Model and Methods 



2.1 The Model 



Recently, we developed a simple sequence-based model with 5-6 atoms per amino 
acid for helical proteins (JiJ^il^. Here we extend that model by incorporating all 
atoms. The interaction potential is deliberately kept simple. The chain representa- 
tion is, by contrast, detailed; in fact, it is more detailed than in standard "all- atom" 
models as all hydrogens are explicitly included. The presence of the hydrogens has 
the advantage that local torsion potentials can be avoided. All bond lengths, bond 
angles and peptide torsion angles (180°) are held fixed, which means that each amino 
acid has the Ramachandran torsion angles 0, ip and a number of side-chain torsion 
angles as its degrees of freedom (for Pro, is held fixed at —65°). The geometry 
parameters held constant are derived by statistical analysis of Protein Data Bank 
(PDB) j2D] structures. A complete list of these parameters can be found as supple- 
mental material. 



The potential function 

E = i^ev + -E'hb + -E'hp (1) 

is composed of three terms, representing excluded-volume effects, hydrogen bonds 
and effective hydrophobicity forces (no explicit water), respectively. The remaining 
part of this section describes these different terms. Energy parameters are quoted in 
dimensionless units, in which the melting temperature Tm, defined as the specific heat 
maximum, is given by kT^ = 0.4462 ± 0.0014 for the /5-hairpin. In the next section, 
the energy scale of the model is set by fixing Tm for this peptide to the experimental 
midpoint temperature, Tm = 297 K [H]. 



The excluded-volume energy Eev is given by 

i<j 

where = 0.10 and = 1.77, 1.71, 1.64, 1.42 and 1.00 A for S, C, N, O and 
H atoms, respectively. Our choice of cij values is guided by the analysis of Tsai et 
al. [21] . The parameter Ajj in Eq. |21 reduces the repulsion between non-local pairs; 
\ij = 1 for all pairs connected by three covalent bonds and for HH and 00 pairs from 
adjacent peptide units, and Ajj = 0.75 otherwise. The pairs for which Ajj = 1 strongly 
influence the shapes of Ramachandran maps and rotamer potentials. The reason for 
using Xij < 1 for the large majority of all pairs is both computational efficiency and 



Aij{(Ti + (Tj) 



(2) 
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the restricted flexibility of chains with only torsional degrees of freedom. To speed 
up the calculations, the sum in Eq. |21 is evaluated using a pair dependent cutoff 



4 = 4.3A., A. 



The hydrogen-bond energy i?hb has the form 



j<i~2 
or ;7>z + l 



where e^^ = 3.1, e^y^ = 2.0 and the functions u and v are given by 

. 5(^f)"-6(^f)'° (4, 

, f (cos a COS ifa,/3>90° 

= |o otherwise 

The first sum in Eq. El represents backbone-backbone hydrogen bonds. Term ij in 
this sum is an interaction between the NH and CO groups of amino acids i and j, 
respectively, r^j denotes the HO distance, and aij and (3ij are the NHO and HOC 
angles, respectively. The second sum in Eq. 01 is expressed in a schematic way. It 
represents interactions between oppositely charged side chains, and between charged 
side chains and the backbone. Both these types of interaction are, for convenience, 
taken to have the same form as backbone-backbone hydrogen bonds. The side chain 
atoms that can act as "donors" or "acceptors" in these interactions are the N atoms 
of Lys and Arg (donors) and the O atoms of Asp and Glu (acceptors). The second 
sum in Eq. O has a relatively weak influence on the thermodynamic behavior of the 
systems studied. The backbone-backbone hydrogen bonds are, by contrast, crucial 
and their strength, e^^, must be carefully chosen [TS] . 

The functional form of the hydrogen-bond energy differs from that in our helix 
model fZl^inni in that the exponent of the cosines is 1/2 instead of 2. The reason 
for this change is that the /3-hairpin turned out to become too regular when using 
the exponent 2; the exponent 1/2 gives a more permissive angular dependence. The 
function u{r) in Eq. HI is calculated using a cutoff = 4.5 A and a^h = 2.0 A. 

The last term of the potential, the hydrophobicity energy E'hp, assigns to each amino 
acid pair an energy that depends on the amino acid types and the degree of contact 
between the side chains. It can be written as 

Ei,^ = e^^J2MijCjj, (6) 

where ehp = 1.5, and the sum runs over all possible amino acid pairs IJ except 
nearest neighbors along the chain. In the present study, the M/j's (< 0) are given 
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Ala Val Leu He Phe Tyr Trp Met 
Ala 0.00 0.44 1.31 0.98 1.21 0.00 0.22 0.34 



Val 1.92 2.88 2.45 2.69 1.02 1.58 1.72 

Leu 3.77 3.44 3.68 2.07 2.54 2.81 

He 2.94 3.24 1.65 2.18 2.42 

Phe 3.66 2.06 2.56 2.96 

Tyr 0.57 1.06 1.31 

Trp 1.46 1.95 

Met 1.86 



Table 1: The interaction matrix M/j, based on the shifted contact-energy matrix of 
Miyazawa and Jernigan ^22j. The table shows absolute values (M/j < 0). 



by the contact energies of Miyazawa and Jernigan |221 shifted to zero mean, provided 
that the amino acids / and J both are hydrophobic and that the shifted contact 
energy is negative; otherwise, Mjj = 0. The statistical Miyazawa- Jernigan energies 
contain, of course, other contributions too, but receive a major contribution from 
hydrophobicity ^31- The matrix Mjj is given in Tabled Eight of the amino acids 
are classified as hydrophobic, namely Ala, Val, Leu, He, Phe, Tyr, Trp and Met. The 
geometry factor Cjj in Eq. [Ul is a measure of the degree of contact between amino 
acids / and J. To define C/j, we use a predetermined set of Nj atoms, denoted 
by Aj, for each amino acid /. For Phe, Tyr and Trp, the set Aj consists of the C 
atoms of the hexagonal ring. The other five hydrophobic amino acids each have an 
Aj containing all its non-hydrogen side-chain atoms. With these definitions, Cjj can 
be written as 



Cu ' 



Ni + Nj 



/(min4)+ J2 /(mill 4; 



(7) 



where the function /(x) = 1 if a; < A, j[x) = if x > -B, and /(x) = {B — x)j(B — A) 
\i A < X < B \A = (3.5 A)^ and B = (4.5 A)^]. Roughly speaking, Cjj is a measure 
of the fraction of atoms in Aj or Aj that are in contact with some atom from the 
opposite side chain. 



2.2 Numerical Methods 



To study the thermodynamic behavior of this model, we use the simulated-tempering 
method PHI^I^ . in which the temperature is a dynamical variable. This method 
is chosen in order to speed up the calculations at low temperatures. Our simulations 
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are started from random configurations, and eiglit different temperatures are studied, 
ranging from 273 K to 366 K. 

The temperature jump is always to a neigliboring temperature and subject to a 
Metropolis accept/reject question [2Zj- For the backbone degrees of freedom, we 
use three different elementary moves: first, the pivot move [2Hj in which a single 
torsion angle is turned; second, a semi-local method [20] that works with seven or 
eight adjacent torsion angles, which are turned in a coordinated way; and third, a 
symmetry-based update of three randomly chosen backbone torsion angles, referred 
to as the mirror update. All updates of side-chain angles and the pivot move are 
Metropolis updates of a single angle, in which the proposed angle is drawn from the 
uniform distribution between 0° and 360°. To see how the mirror update works, 
consider the three bonds corresponding to the randomly chosen torsion angles. The 
idea is then to reflect the mid bond in the plane defined by the two others, keeping the 
directions of these two other bonds fixed. Both this update and the pivot move are 
non-local. They are included in our thermodynamic calculations in order to accelerate 
the evolution of the system at high temperatures. The ratio of attempted temperature 
moves to conformation moves is 1:100. 70% of the conformation moves are side- 
chain moves. The relative ratios of attempts for the three types backbone moves is 
temperature dependent. The pivot : semi- local : mirror ratio varies from 1 : 4 : 1 at the 
lowest temperature to 5 : : 1 at the highest temperature. 

Our kinetic simulations are also Monte Carlo-based, and only meant to mimic the time 
evolution of the system in a qualitative sense. They differ from our thermodynamic 
simulations in two ways: first, the temperature is held constant; and second, the two 
non-local backbone updates are not used, but only the semi-local method ISHj. This 
restriction is needed in order to avoid large unphysical deformations of the chain. For 
the side-chain degrees of freedom, we use a Metropolis step in which the angle can 
change by any amount (same as in the thermodynamic runs). Thus, it is assumed 
that the torsion angle dynamics are much faster for the side chains than for the 
backbone. 

In our thermodynamic analysis, statistical errors are obtained by analyzing data 
from ten independent runs, each containing 10^ elementary steps and several fold- 
ing/unfolding events. All errors quoted are la errors. All fits of data discussed in 
the next section are carried out by using a Levenberg-Marquardt procedure |30j . 
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3 Results and Discussion 



Using the model described in the previous section, we first study the second /3-hairpin 
from the protein G Bl domain (amino acids 41-56). Blanco et al. j3] analyzed this 
peptide in solution by NMR and found that the excised fragment adopts a structure 
similar to that in the full protein, although the NMR restraints were insufficient to 
determine a unique structure. In our calculations, in the absence of a complete struc- 
ture for the isolated fragment, we monitor the root-mean-square deviation (rmsd) 
from the native /3-hairpin of the full protein (PDB code IGBl, first model), as deter- 
mined by NMR . The native /3-hairpin contains a hydrophobic cluster consisting 
of Trp43, Tyr45, Phe52 and Val54. There is experimental evidence [^2] that this 
cluster as well as sequence-specific hydrogen bonds in the turn are crucial for the 
stability of the isolated /3-hairpin. 

Fig.^ shows the free energy -F(A, E) as a function of rmsd from the native /3-hairpin, 
A, and energy, E, at the temperature T = 273 K. For a /5-hairpin there are two 
topologically distinct states with similar backbone folds but oppositely oriented side 
chains. The global minimum of -F(A, E) is found at 2-4 A in A and corresponds to a 
/3-hairpin with the native topology and the native set of hydrogen bonds between the 
two strands. The main difference between structures within this minimum lies in the 
shape of the turn. The precise shape of the /9-hairpin is, not unexpectedly, sensitive 
to details of the potential; in particular, we find that the second term in Eq. IHldoes 
infiuence the shape of the turn, while having only a small effect on thermodynamic 
functions such as i?hp- Therefore, it is not unlikely that a more detailed potential 
would discriminate between different shapes of the turn, and thereby make the free- 
energy minimum more narrow. 

Besides its global minimum, F{A,E) exhibits two local minima (see Fig. ^), one 
corresponding to a /3-hairpin with the non-native topology ( A ^ 5 A) , and the other 
to an a-helix (A ^ 10 A). A closer examination of structures from the two /5-hairpin 
minima reveals that the Cp-Cjs distances for Tyr45-Phe52 and Trp43-Val54 tend to be 
smaller in the non-native topology than in the native one. This is important because 
it makes it sterically difficult to achieve a proper contact between the aromatic side 
chains of Tyr45 and Phe52 in the non-native topology. As a result, this topology is 
hydrophobically disfavored. This is the main reason why the model indeed favors the 
native topology over the non-native one. 

We now turn to the melting behavior of the /3-hairpin. By studying tryptophan 
fiuorescence (Trp43), Munoz et al. j6j found that the unfolding of this peptide with 
increasing temperature shows two-state character, with parameters = 297 K and 
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Figure 1: Free energy F(A, E) = -kT In P(A, E) at T = 273 K for (a) the /3-hairpin 
and (b) the Fg peptide. E is energy and A denotes rmsd from the native /?-hairpin 
and an ideal a-hehx, respectively, calculated over all non-hydrogen atoms (a backbone 
rmsd would be unable to distinguish the two possible /?-hairpin topologies). 



AE = 11.6 kcal/mol, and AE being the melting temperature and energy change, 
respectively. To study the character of the melting transition in our model, we 
monitor the hydrophobicity energy E^p, a simple observable we expect to be strongly 
correlated with Trp43 fluorescence. Following Mufioz et al. [6], we fit our data for i?hp 
to a first-order two-state model. To reduce the number of parameters of the fit, is 
held fixed, at the specific heat maximum (data not shown). The fit turns out not to 
be perfect, with a x^/dof of 4.5. The deviations from the fitted curve are nevertheless 
small, as can be seen from Fig. they can be detected only because the statistical 
errors are very small (~ 0.1 %) at the highest temperatures. To further illustrate this 
point, we assign each data point an artifical uncertainty of 1 %, an error size that is 
not uncommon for experimental data. With these errors, the same type of fit yields 
a x^/dof of 0.3, which confirms that the data indeed to a good approximation show 
two-state behavior. Our fitted value of AE is 9.3 ± 0.3 kcal/mol, which implies that 
the temperature dependence of the model is comparable to experimental data [H]. 

Several groups have simulated the same /?-hairpin using atomic models with im- 
plicit [3 El El or explicit IH [10111111121 solvent. Many of these groups studied the 
melting behavior of the /5-hairpin, but the temperature dependence they found was 
too weak, as was pointed out by Zhou et al. [12]. In fact, in all these studies, there 
was a significant /?-hairpin population at temperatures of 400 K and above. Another 
important difference between at least some of these models |HlllO|lllj and ours, is that 
in our model there is no clear free-energy minimum corresponding to a hydrophobi- 
cally collapsed state with few or no hydrogen bonds. A local free-energy minimum 
with helical content was found in one of these studies |TT], but not in the others. 
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Figure 2: Unfolding of the /3-hairpin sequence, (a) Temperature dependence of the 
hydrophobicity energy Ei^p (see Eq. The sohd and dashed curves (essentially 
coinciding) are fits of the data to the two-state expression E'hp = (E^^+KEl^) / (l+K) 
and the square- well model (see text), respectively. The effective equilibrium constant 
K is assumed to have the first-order form K = exp[(l/fcT — l/kT^)AE]. Both fits 
have three free parameters, whereas = 297 K is held fixed, (b) Free-energy profile 
F{E) = —kT\nP{E) at T = Tm, obtained by reweighting |34j the data at a simulated 
T close to Tm. The shaded band is centered around the expected value and shows 
statistical la errors. The double-headed arrow indicates AE of the two-state fit. The 
dashed line shows F{E) for the square-well fit. 



Such a minimum exists in our model (see Fig. [T^), but the helix population is low. 

In spite of its minimalistic potential, our model is able to make a-helices too. To 
show this, we consider the a-helical so-called Fg peptide, which has been extensively 
studied both experimentally [T^ll4[ll5tll6j and theoretically [33]. This 21-amino acid 
peptide is given by AAAAA(AAARA)3A, where A is Ala and R is Arg. Using exactly 
the same model as before, with unchanged parameters, we find that the Fg sequence 
does make an a-helix. This can be seen from Fig. ^p, which shows the free energy 
F{A,E) at T = 273 K, A this time denoting rmsd from an ideal a-helix. F{A,E) 
has only one significant minimum, which indeed is helical. The melting behavior 
of this sequence is illustrated in Fig. which shows the temperature dependence 
of the hydrogen-bond energy. Data are again quite well described by a first-order 
two-state model; the x^/dof for the fit is 20.5 and would be 1.7 if the errors were 
1 %. Our fitted value of AE is 16.1 ± 0.9kcal/mol for Fg, which may be compared 
to the result AE = 12 ± 2kcal/mol obtained by a two-state fit of infrared (IR) 
spectroscopy data |15j. As in the /5-hairpin analysis, Tm is determined from the 
specific heat maximum (data not shown). For Fg, we obtain Tm = 310 K, which may 
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Figure 3: Unfolding of the Fg sequence, (a) Temperature dependence of the hydrogen- 
bond energy ii^hb (see Eq. Oj), with the same two types of fit as in Fig. ^ (same 
symbols), (b) Free-energy profile F{E) = —kT In P[E) at T = Tm. Same symbols as 
in Fig. 

be compared to the values = 303, 308 K and Tm = 334 K obtained by circular 
dichroism (CD) P^ITB] and IR spectroscopy [T^, respectively. Let us stress that 
for Fg is a prediction of the model; the energy scale of the model is set using Tm for 
the /3-hairpin and then left unchanged in our study of Fg. 

The two-state fits shown in Figs. EK and are based on a first-order expression for 
the free energies of the two coexisting phases. The fits look good and can be improved 
by including higher order terms, which may give the impression that the behaviors of 
these systems can be fully understood in terms of a two-state model. However, the 
two-state picture is far from perfect. This can be seen from the free-energy profiles 
F{E) shown in Figs, and ^jp, which lack a clear bimodal shape. Clearly, this 
renders the parameters of a two-state model, such as AE, ambiguous. The analysis 
of these systems therefore shows that the results of a two-state fit must be interpreted 
with care. Given the actual shapes of F{E), it is instructive to perform an alternative 
fit of the data in Figs. andEt', based on the assumptions that 1) F{E) has the 
shape of a square well of width AEg^ at T = T^, and that 2) the observable analyzed 
varies linearly with E^ These square- well fits are shown in Figs. |2K and|2ti', and the 
corresponding free-energy profiles F{E) (at T = Tm) are indicated in Figs. 1213 and 
EJd. The square-well fits are somewhat better than the two-state fits. However, the 

^With these two assumptions, one finds that the average value of an arbitrary observable O at 
temperature T is given by 0{T) = (0^(1 -t) + Oh)X*dt j A*dt = O" + (O^ - 0")(^ - j^j), 

where A = exp[(l/fcr— l/fcTm)Ai?sw] and and O'^ arc the values of O at the respective edges of 
the square well. 
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fitted curves are strikingly similar, given the large difference between the underlying 
energy distributions. This shows that it is very hard to draw conclusions about the 
free-energy profile F{E) from the temperature dependence of a single observable. 

From Figs.Eb andEb it can also be seen that the energy change AE obtained from the 
two-state fit is considerably smaller than the width of the energy distribution, which 
indicates that AE is smaller than the calorimetric energy change Aii^cai- Scholtz et 
al. j33 determined Aii^cai experimentally for an Ala-based helical peptide with 50 
amino acids, and obtained a value of 1.3kcal/mol per amino acid. This value cor- 
responds to a AEcai of 27.3 kcal/mol for the Fg peptide. Comparing model results 
for Ai^cai with experimental data is not straightforward, due to uncertainties about 
what the relevant baseline subtractions are jnEUHZUHHl • If we ignore baseline subtrac- 
tions and simply define AE'cai as the energy change between the highest and lowest 
temperatures studied, we obtain AE^ai = 45.6 ± 0.1 kcal/mol for Fg, which is larger 
than the value of Scholtz et al. To get an idea of how much this result can be 
affected by a baseline subtraction, a fit of our specific heat data is performed, to a 
two-state expression supplemented with a baseline linear in T. The fit function is 
a = AE,ai(l + i^)-2^ + co + ci(T-T^), where cq and Ci are baseline parameters and 
K = exp[(l/fcT — l/kT^)AE]. With Aii^cab ^E, cq, Ci and as free parameters, 
this fit gives Ai^^ai = 34.0 ± 1.0 kcal/mol (x^/dof = 5.2), which is considerably closer 
to the value of Scholtz et al. It may be worth noting that the corresponding 

fit without baseline subtraction is much poorer (x^/dof ~ 300). From these calcula- 
tions, we conclude that the model may overestimate AEcai, but it is not evident that 
the deviation is statistically significant, due to theoretical as well as experimental 
uncertainties. 

The melting behavior of helical peptides is often analyzed using the Zimm-Bragg ISH] 
or Lifson-Roig [30] models, which for large chain lengths are very different from the 
two-state model considered above. Our results for the Fg peptide are, nevertheless, 
quite well described by these models too. In fact, a fit of the helix content as a function 
of temperature to the Lifson-Roig model gives a x^/dof similar to that for the two- 
state fit above. Our fitted Lifson-Roig parameters are v = 0.016 ± 0.009 and w{T = 
273 K) = 1.86 ± 0.25, corresponding to the Zimm-Bragg parameters cr = 0.0003 ± 
0.0003 and s{T = 273 K) = 1.83±0.25 gj. In this fit the temperature dependence of 
w is given by a first-order two-state expression, whereas v is held constant. The energy 
change AE^ has a fitted value of 1.33 ± 0.17kcal/mol. The statistical uncertainties 
on V and cr are large because the chain is small, which makes the dependence on 

■^We define helix content in the foUowing way. Each amino acid, except the two at the ends, is 
labeled h if —90° < (j) < —30° and —77° < ip < — f7°, and c otherwise, j consecutive h's form a 
helical segment of length j — 2. The maximal number of amino acids in helical segments is then 
iV — 4 for a chain with N amino acids. 
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these parameters weak. Thompson et al. ^H] performed a Zimm-Bragg analysis 
of CD data for Fg, using the single-sequence approximation. Assuming a value of 
Ai?<j = 1.3kcal/mol for the energy change associated with helix propagation, they 
obtained a a of 0.0012. 

Our kinetic simulations of the two peptides are performed at their respective melting 
temperatures, Tm. Starting from equilibrium conformations at T = 366 K, we study 
the relaxation of ensemble averages under Monte Carlo dynamics (see Section 2.2). 
The ensemble consists of 1500 independent runs for each peptide. In Fig. |3J we show 
the "time" evolution of 50{t) = 0{t) — (O), where 0{t) is an ensemble average after t 
Monte Carlo steps, (O) is the corresponding equilibrium average, and the observable 
O is Ehp for the /3-hairpin and i?hb for Fg (same observables as in the thermodynamic 
calculations). Ignoring a brief initial period of rapid change, we find that the data, for 
both peptides, are fully consistent with single-exponential relaxation (x^/dof ~ 1), 
although the interval over which the signal 60 (t) can be followed is small in units of 
the relaxation time, especially for the /3-hairpin. Nevertheless, assuming the single- 
exponential behavior to be correct, a statistically quite accurate determination of 
the relaxation times can be obtained. The fitted relaxation time is approximately a 
factor of 5 larger for the /?-hairpin than for Fg. The corresponding factor is around 30 
for experimental data jHlElEl- A closer look at the /3-hairpin data shows that the 
hydrophobic cluster and the hydrogen bonds, on average, form nearly simultaneously 
in our model. This is in agreement with the results of Zhou et al. ^2], and in 
disagreement with the folding mechanism of Pande and Rokhsar in which the 
collapse occurs before the hydrogen bonds form. 

The two peptides studied in this paper make unusually clear-cut a- and /5-structure, 
respectively. It is clear that refinements of the interaction potential will be required 
in order to obtain an equally good description of more general sequences. One in- 
teresting refinement would be to make the strength of the hydrogen bonds context- 
dependent, that is dependent on whether the hydrogen bond is internal or exposed. 
This is probably needed in order for the model to capture, for example, the differ- 
ence between the Ala-based Fg peptide and pure polyalanine. In fact, it has been 
argued [^21112] that a major reason why Fg is a strong helix maker is that the Arg 
side chains shield the backbone from water and thereby make the hydrogen bonds 
stronger. The hydrogen bonds of a polyalanine helix lack this protection. In our 
model, the hydrogen bonds are context-independent, which could make polyalanine 
too helical. Although a direct comparison with experimental data is impossible due 
to its poor water solubility, simulations of polyalanine with 21 amino acids, A21, 
seem to confirm this. For A21, we obtain a helix content of about 80 % at T = 273 K, 
which is what we find for Fg too. Using a modified version of the Cornell et al. force 
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Figure 4: Monte Carlo relaxation of ensemble averages at T = Tm for the /3-hairpin 
and the Fg peptide. The deviation 60{t) from the equilibrium average (see text) is 
plotted against the number of elementary Monte Carlo steps, t. Straight lines are 
fits of the data to a single exponential. Data for t > 15 ■ 10^ are omitted for Fg due 
to large statistical errors. 

field jj43i , Garcia and Sanbonmatsu obtained a helix content of 34 % at T = 275 K 
for A21; the unmodified force field was found to give a value similar to ours at 
this temperature (but very different from ours at higher T) . Our estimate that Fg is 
~80% hehcal at T = 273 K is consistent with experimental data [TBIITB] . 

We also looked at two other helical peptides. The first of these is the Ala-based 16- 
amino acid peptide (AEAAK)3A, where E is Glu and K is Lys. By CD, Marqusee and 
Baldwin found this peptide to be ~ 50 % helical at T = 274 K. In our model the 
corresponding value turns out to be ~ 70 %. Our last example is the 38-59-fragment 
of the B domain of staphylococcal protein A (PDB code IBDD). This is a more 
general, not Ala-based sequence, containing three hydrophobic Leu. By CD, Bai et 
al. obtained a helix content of ~ 30 % at pH 5.2 and T = 278 K for this fragment. 
In our model we obtain a helix content of ~ 20 % at this temperature. So, the model 
predicts helix contents that are in approximate agreement with experimental data 
for Fg, (AEAAK)3A as well as the protein A fragment. 



4 Summary and Outlook 



We have developed and explored a protein model that combines an all-atom repre- 
sentation of the amino acid chain with a minimalistic sequence-based potential. The 



13 



strength of the model is the simphcity of the potential, which at the same time, of 
course, means that there are many interesting features of real proteins that the model 
is unable to capture. One advantage of the model is that the calibration of parame- 
ters, which any model needs, becomes easier to carry out with fewer parameters to 
tune. 

When calibrating the model, our goal was to ensure that, without resorting to param- 
eter changes, our two sequences made a /5-hairpin with the native topology and an 
a-helix, respectively, which was not an easy task. Once this goal had been achieved, 
our thermodynamic and kinetic measurements were carried out without any further 
fine-tuning of the potential. Therefore, it is hard to beheve that the generally quite 
good agreement between our thermodynamic results and experimental data is acci- 
dental. A more plausible explanation of the agreement is that the thermodynamics 
of these two sequences indeed are largely governed by backbone hydrogen bonding 
and hydrophobic collapse forces, as assumed by the model. The requirement that 
the two sequences make the desired structures is then sufficient to quite accurately 
determine the strengths of these two terms. 

The main results of our calculations can be summarized as follows. 

• Our thermodynamic simulations show first of all that the two sequences studied 
indeed make a /3-hairpin with the native topology and an a-helix, respectively. 
The main reason why the model favors the native topology over the non-native 
one for the /3-hairpin, is that the formation of the hydrophobic cluster is ster- 
ically difficult to accomplish in the non-native topology. The melting curves 
obtained for the two peptides are in reasonable agreement with experimental 
data, and can to a good approximation be described by a simple two-state 
model. 

• A two-state description of the thermodynamic behavior is, nevertheless, found 
to be an oversimplification for both peptides, as can be seen from the energy 
distributions. Given that the systems are small and fluctuations therefore rela- 
tively large, this is maybe not surprising. What is striking is how difficult it is to 
detect these deviations from two-state behavior when studying the temperature 
dependence of a single observable. 

• The results of our Monte Carlo-based kinetic runs at the respective melting 
temperatures are, for both peptides, consistent with single-exponential relax- 
ation, and the relaxation time is found to be larger for the /9-hairpin than for 
Fs. 
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Extending these calculations to larger chains will impose new conditions on the in- 
teraction potential, and thereby make it possible (and necessary) to refine it. Two 
interesting refinements would be to make the treatment of charged side chains and 
side-chain hydrogen bonds less crude and to introduce a mechanism for screening 
of hydrogen bonds |S3liniE2lllZl • Computationally, there is room for extending the 
calculations. In fact, simulating the thermodynamics of a chain with about 20 amino 
acids, with high statistics, does not take more than a few days on a standard desktop 
computer, in spite of the detailed geometry of the model. This gives us hope to be 
able to look into the free-energy landscape and two-state character of small proteins 
in a not too distant future. 
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